Improving Classification-Based Natural Language Understanding with Non-Expert Annotation
Authors
Abstract
Although data-driven techniques are commonly used for Natural Language Understanding in dialogue systems, their efficacy is often hampered by the lack of appropriate annotated training data in sufficient amounts. We present an approach for rapid and cost-effective annotation of training data for classification-based language understanding in conversational dialogue systems. Experiments using a web-accessible conversational character that interacts with a varied user population show that a dramatic improvement in natural language understanding and a substantial reduction in expert annotation effort can be achieved by leveraging non-expert annotation.
Similar resources
Classification of telicity using cross-linguistic annotation projection
This paper addresses the automatic recognition of telicity, an aspectual notion. A telic event includes a natural endpoint (she walked home), while an atelic event does not (she walked around). Recognizing this difference is a prerequisite for temporal natural language understanding. In English, this classification task is difficult, as telicity is a covert linguistic category. In contrast, in ...
Fast semi-automatic semantic annotation for spoken dialog systems
This paper describes a bootstrapping methodology for semi-automatic semantic annotation of a “mini-corpus” that is conventionally annotated manually to train an initial parser used in natural language understanding (NLU) systems. We propose to cast the problem of semantic annotation as a classification problem: each word is assigned a unique set of semantic tag(s) and/or label(s) from the univ...
Cheap and Fast - But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks
Human linguistic annotation is crucial for many natural language processing tasks but can be expensive and time-consuming. We explore the use of Amazon’s Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web. We investigate five tasks: affect recognition, word similarity, recognizing textual en...
Improving Classification of Natural Language Answers to ITS Questions with Item-Specific Supervised Learning
In a natural language intelligent tutoring system, improving assessment of student input is a challenge that typically requires close collaboration between domain experts and NLP experts. This paper proposes a method for building small item-specific classifiers that would allow a domain expert author to improve quality of assessment for student input, through supervised tagging of a small numbe...
On Designing Controlled Natural Languages for Semantic Annotation
Manual semantic annotation is a complex and arduous task, both time-consuming and costly, often requiring specialist annotators. (Semi)-automatic annotation tools attempt to ease this process by detecting instances of classes within text and relationships between classes; however, their usage often requires knowledge of Natural Language Processing (NLP) and/or formal ontological descriptions. This ...